Rethinking FPGA Computing with a Many-Core Approach
نویسندگان
چکیده
While ASIC design and manufacturing costs are soaring with each new technology node, the computing power and logic capacity of modern FPGAs steadily advances. Therefore, high-performance computing with FPGA-based system becomes increasingly attractive and viable. Unfortunately, truly unleashing the computing potential of FPGAs often stipulates cumbersome HDL programming and laborious manual optimization. To circumvent such challenges, we propose a Many-core Approach to Reconfigurable Computing (MARC) that (i) allows programmers to easily express parallelism through a high-level programming language, (ii) supports coarse-grain multithreading and dataflowstyle fine-grain threading while permitting bit-level resource control, and (iii) greatly reduces the effort required to repurpose the hardware system for different algorithms or different applications. Leveraging a many-core architectural template, sophisticated logic synthesizing techniques, and state-of-art compiler optimization technology, a MARC system enables efficient highperformance computing for applications expressed with imperative programming languages such as C/C++ by exploiting abundant special FPGA resources such as distributed block memories and DSP blocks to implement complete single-chip high efficiency many-core microarchitectures. To quantitatively validate the proposed MARC system, we implemented a MARC prototype machine consisting of one control processing core and 32 arithmetic processing cores using a Virtex-5 (XCV5LX155T-2) FPGA. For a well-known general-purpose Bayesian computing problem, we compare the throughput and runtime of this MARC machine, with fully synthesized application-specific processing cores, against a manually optimized FPGA implementation—BCM (Bayesian Computing Machine) [1]. As the problem sizes range from 10 to 10, this MARC machine achieve 8.13 GFLOPS in throughput on average, which is 43% of that of BCM but with much less design/implementation effort and much greater portability and retargetability. More importantly, we developed a simple analytical performance model to explain the performance discrepancy between the MARC machine and the hand-optimized BCM FPGA implementation [1].
منابع مشابه
A Literature Review on Cloud Computing Security Issues
The use of Cloud Computing has increasedrapidly in many organization .Cloud Computing provides many benefits in terms of low cost and accessibility of data. In addition Cloud Computing was predicted to transform the computing world from using local applications and storage into centralized services provided by organization.[10] Ensuring the security of Cloud Computing is major factor in the Clo...
متن کاملA Literature Review on Cloud Computing Security Issues
The use of Cloud Computing has increasedrapidly in many organization .Cloud Computing provides many benefits in terms of low cost and accessibility of data. In addition Cloud Computing was predicted to transform the computing world from using local applications and storage into centralized services provided by organization.[10] Ensuring the security of Cloud Computing is major factor in the Clo...
متن کاملNeuro-fuzzy control of bilateral teleoperation system using FPGA
This paper presents an adaptive neuro-fuzzy controller ANFIS (Adaptive Neuro-Fuzzy Inference System) for a bilateral teleoperation system based on FPGA (Field Programmable Gate Array). The proposed controller combines the learning capabilities of neural networks with the inference capabilities of fuzzy logic, to adapt with dynamic variations in master and slave robots and to guarantee good prac...
متن کاملFPGA Implementation of a Hammerstein Based Digital Predistorter for Linearizing RF Power Amplifiers with Memory Effects
Power amplifiers (PAs) are inherently nonlinear elements and digital predistortion is a highly cost-effective approach to linearize them. Although most existing architectures assume that the PA has a memoryless nonlinearity, memory effects of the PAs in many applications ,such as wideband code-division multiple access (WCDMA) or orthogonal frequency-division multiplexing (OFDM), can no longer b...
متن کاملImplementation of VlSI Based Image Compression Approach on Reconfigurable Computing System - A Survey
Image data require huge amounts of disk space and large bandwidths for transmission. Hence, imagecompression is necessary to reduce the amount of data required to represent a digital image. Thereforean efficient technique for image compression is highly pushed to demand. Although, lots of compressiontechniques are available, but the technique which is faster, memory efficient and simple, surely...
متن کامل